Skip to content

API: PeriodIndex subtraction to return object Index of DateOffsets #20049

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jun 29, 2018

Conversation

jbrockmendel
Copy link
Member

Implements _sub_period_array in DatetimeIndexOpsMixin. Behavior is analogous to subtraction of Period scalar.

cls=type(self).__name__))

if not len(self) == len(other):
raise ValueError("cannot add indices of unequal length")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have missed it, but is there a test that hits this? And should it say "subtract" instead of "add"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catches on both counts. Will fix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed with fix to add-->subtract. test_pi_sub_pi, test_pi_sub_pi_with_nat, test_pi_sub_isub_pi each go through this method, though I don't think any go through this particular line

@@ -710,6 +710,7 @@ Other API Changes
- ``Categorical.fillna`` now validates its ``value`` and ``method`` keyword arguments. It now raises when both or none are specified, matching the behavior of :meth:`Series.fillna` (:issue:`19682`)
- ``pd.to_datetime('today')`` now returns a datetime, consistent with ``pd.Timestamp('today')``; previously ``pd.to_datetime('today')`` returned a ``.normalized()`` datetime (:issue:`19935`)
- :func:`Series.str.replace` now takes an optional `regex` keyword which, when set to ``False``, uses literal string replacement rather than regex replacement (:issue:`16808`)
- :class:`PeriodIndex` subtraction of another ``PeriodIndex`` will now return a numeric ``Index`` instead of raising a ``TypeErrorr`` (:issue:`20049`)
Copy link
Member

@jschendel jschendel Mar 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra "r" at the end of "TypeErrorr"

@jreback
Copy link
Contributor

jreback commented Mar 8, 2018

I think we should be returning an array (Index?) of offsets rather than the integers which are untyped.

we don't really support this so an object Index would be ok I think.

@jreback jreback added Frequency DateOffsets Period Period data type labels Mar 8, 2018
@jbrockmendel
Copy link
Member Author

I think we should be returning an array (Index?) of offsets rather than the integers which are untyped.

I more-or-less agree, but ATM Period - Period returns an integer and PeriodIndex - Period returns an integer array. These three operations should all be consistent.

@jreback
Copy link
Contributor

jreback commented Mar 10, 2018

I more-or-less agree, but ATM Period - Period returns an integer and PeriodIndex - Period returns an integer array. These three operations should all be consistent.

and these should changed (both). yes its an API breaking change but much more intuitve / better. it will just have to be an object array but that's ok.

@jbrockmendel
Copy link
Member Author

jbrockmendel commented Mar 18, 2018

and these should changed (both). yes its an API breaking change but much more intuitve / better. it will just have to be an object array but that's ok.

Implementing this, the non-obvious issue that comes up is what to do with NaTs in a PeriodIndex when subtracting Period. Never mind, this is easy to handle.

@codecov
Copy link

codecov bot commented Jun 4, 2018

Codecov Report

Merging #20049 into master will increase coverage by <.01%.
The diff coverage is 94.44%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #20049      +/-   ##
==========================================
+ Coverage    91.9%    91.9%   +<.01%     
==========================================
  Files         153      153              
  Lines       49549    49565      +16     
==========================================
+ Hits        45537    45554      +17     
+ Misses       4012     4011       -1
Flag Coverage Δ
#multiple 90.3% <94.44%> (ø) ⬆️
#single 41.76% <11.11%> (-0.02%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexes/datetimelike.py 96.8% <94.44%> (-0.09%) ⬇️
pandas/util/testing.py 85.27% <0%> (+0.2%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f1ffc5f...0886917. Read the comment docs.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks pretty good. rebase, small comments. @jschendel can you give a once-over

@@ -996,6 +996,7 @@ Other API Changes
- :func:`DataFrame.to_dict` with ``orient='index'`` no longer casts int columns to float for a DataFrame with only int and float columns (:issue:`18580`)
- A user-defined-function that is passed to :func:`Series.rolling().aggregate() <pandas.core.window.Rolling.aggregate>`, :func:`DataFrame.rolling().aggregate() <pandas.core.window.Rolling.aggregate>`, or its expanding cousins, will now *always* be passed a ``Series``, rather than a ``np.array``; ``.apply()`` only has the ``raw`` keyword, see :ref:`here <whatsnew_0230.enhancements.window_raw>`. This is consistent with the signatures of ``.aggregate()`` across pandas (:issue:`20584`)
- Rolling and Expanding types raise ``NotImplementedError`` upon iteration (:issue:`11704`).
>>>>>>> cbec58eacd8e9cd94b7f42351b8de4559c250909
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover

@@ -740,6 +740,17 @@ def test_sub_period(self, freq):
with pytest.raises(TypeError):
p - idx

@pytest.mark.parametrize('op', [operator.add, ops.radd,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add (maybe another test) add/sub with a pi and another dtype (I don't think we have this test?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking, you mean to add cases over in the period test_arithmetic file, not here, right?

rng = pd.period_range('1/1/2000', freq='D', periods=5)
other = pd.period_range('1/6/2000', freq='D', periods=5)

# previously performed setop union, now raises TypeError (GH14164)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a pretty old comment, can you improve

@@ -258,6 +258,44 @@ def test_comp_nat(self, dtype):


class TestPeriodIndexArithmetic(object):
# ---------------------------------------------------------------
# __add__/__sub__ with PeriodIndex

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you give a 1-2 line on what are allowed ops on PI

# TODO needs to wait on #13077 for decision on result type
def test_pi_sub_isub_pi(self):
# GH#20049
# previously raised TypeError (GH#14164), before that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you refresh comment

@jreback jreback added this to the 0.24.0 milestone Jun 5, 2018
@pep8speaks
Copy link

pep8speaks commented Jun 6, 2018

Hello @jbrockmendel! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on June 22, 2018 at 03:51 Hours UTC

@jbrockmendel
Copy link
Member Author

Removing test_pi_sub_pi because it is entirely redundant with test_pi_sub_isub_pi. Moving the latter to where it should be in the test file.

@jorisvandenbossche jorisvandenbossche changed the title implement add/sub for period_dtype other API: PeriodIndex subtraction to return object Index of DateOffsets Jun 12, 2018
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't first the Period scalar substraction be changed as well?

@jbrockmendel
Copy link
Member Author

@jorisvandenbossche yes, thats in #21314.

@@ -68,6 +68,7 @@ Datetimelike API Changes

- For :class:`DatetimeIndex` and :class:`TimedeltaIndex` with non-``None`` ``freq`` attribute, addition or subtraction of integer-dtyped array or ``Index`` will return an object of the same class (:issue:`19959`)
- :class:`DateOffset` objects are now immutable. Attempting to alter one of these will now raise ``AttributeError`` (:issue:`21341`)
- :class:`PeriodIndex` subtraction of another ``PeriodIndex`` will now return an object-dtype :class:`Index` of :class:`DateOffset` objects instead of raising a ``TypeErrorr`` (:issue:`20049`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra "r" at the end of "TypeErrorr"

def test_pi_sub_pi_with_nat(self):
rng = pd.period_range('1/1/2000', freq='D', periods=5)
other = rng[1:].insert(0, pd.NaT)
assert other[1:].equals(rng[1:])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this assert necessary? I don't immediately see why it's needed, but could be missing something. If it is needed, can assert_index_equal be used instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is checking that I haven't already screwed up, i.e. that the test below is indeed testing what I think it is testing.

@jbrockmendel
Copy link
Member Author

gentle ping (this should be tandem with #21314)

@jreback jreback merged commit a3e56f2 into pandas-dev:master Jun 29, 2018
@jreback
Copy link
Contributor

jreback commented Jun 29, 2018

thanks @jbrockmendel

also pls have a look at the period docs to see if we need updating anywhere.

@jbrockmendel jbrockmendel deleted the per_array branch April 5, 2020 17:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Frequency DateOffsets Period Period data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH/API: Decide what to return Period - Period subtraction
5 participants